Low-rank Bandits with Latent Mixtures

Authors

Aditya Gopalan

Odalric-Ambrym Maillard

Mohammadi Zaki

Abstract

We study the task of maximizing rewards from recommending items (actions) to users sequentially interacting with a recommender system. Users are modeled as latent mixtures of C representative user classes, where each class specifies a mean reward profile across actions. Both the user features (mixture distribution over classes) and the item features (mean reward vector per class) are unknown a priori. The user identity is the only contextual information available to the learner while interacting. This induces a low-rank structure on the matrix of expected rewards r_{a,b} from recommending item a to user b. The problem reduces to the well-known linear bandit when either user-side or item-side features are perfectly known. In the setting where each user, with its stochastically sampled taste profile, interacts only for a small number of sessions, we develop a bandit algorithm for the two-sided uncertainty. It combines the Robust Tensor Power Method of Anandkumar et al. (2014b) with the OFUL linear bandit algorithm of Abbasi-Yadkori et al. (2011). We provide the first rigorous regret analysis of this combination, showing that its regret after T user interactions is Õ(C√(BT)), with B the number of users. An ingredient towards this result is a novel robustness property of OFUL, of independent interest.
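The low-rank structure described in the abstract can be illustrated with a small numerical sketch. It is not the paper's algorithm: it only builds an expected-reward matrix from per-class item reward profiles and per-user mixture weights, then checks that its rank is at most C. The sizes, the random sampling, and the names `class_rewards`, `user_mixtures`, and `expected_reward_matrix` are assumptions made for illustration.

```python
import numpy as np


def expected_reward_matrix(class_rewards, user_mixtures):
    """Return the A x B matrix with r[a, b] = sum_c class_rewards[a, c] * user_mixtures[c, b]."""
    return class_rewards @ user_mixtures


rng = np.random.default_rng(0)
A, B, C = 6, 8, 3  # hypothetical numbers of items, users, and user classes

# Item side: mean reward of each item under each class (A x C), assumed in [0, 1].
class_rewards = rng.uniform(size=(A, C))

# User side: each column is one user's mixture distribution over the C classes (C x B).
user_mixtures = rng.dirichlet(np.ones(C), size=B).T

R = expected_reward_matrix(class_rewards, user_mixtures)

# R is a product of an A x C and a C x B matrix, so its rank cannot exceed C.
rank = np.linalg.matrix_rank(R)
print(R.shape, rank)
```

If either factor were known, each row or column of R would be a linear function of known features, which is the sense in which the abstract says the problem reduces to a linear bandit.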

Related resources

Latent Contextual Bandits: A Non-Negative Matrix Factorization Approach

We consider the stochastic contextual bandit problem with a large number of observed contexts and arms, but with a latent low-dimensional structure across contexts. This low dimensional (latent) structure encodes the fact that both the observed contexts and the mean rewards from the arms are convex mixtures of a small number of underlying latent contexts. At each time, we are presented with an ...

Stochastic Low-Rank Bandits

Many problems in computer vision and recommender systems involve low-rank matrices. In this work, we study the problem of finding the maximum entry of a stochastic low-rank matrix from sequential observations. At each step, a learning agent chooses pairs of row and column arms, and receives the noisy product of their latent values as a reward. The main challenge is that the latent values are un...

Contextual Bandits with Latent Confounders: An NMF Approach

Motivated by online recommendation and advertising systems, we consider a causal model for stochastic contextual bandits with a latent low-dimensional confounder. In our model, there are L observed contexts and K arms of the bandit. The observed context influences the reward obtained through a latent confounder variable with cardinality m (m ≪ L, K). The arm choice and the latent confounder cau...

Nice latent variable models have log-rank

Matrices of low rank are pervasive in big data, appearing in recommender systems, movie preferences, topic models, medical records, and genomics. While there is a vast literature on how to exploit low rank structure in these datasets, there is less attention on explaining why the low rank structure appears in the first place. We explain the abundance of low rank matrices in big data by proving ...

Latent Contextual Bandits and their Application to Personalized Recommendations for New Users

Personalized recommendations for new users, also known as the cold-start problem, can be formulated as a contextual bandit problem. Existing contextual bandit algorithms generally rely on features alone to capture user variability. Such methods are inefficient in learning new users’ interests. In this paper we propose Latent Contextual Bandits. We consider both the benefit of leveraging a set o...

Journal: CoRR

Volume: abs/1609.01508

Publication date: 2016
